Method and System of Audio File Searching
BACKGROUND 

1. Field of the Present Invention

The present invention generally relates to the field of digital electronic information, and 
more particularly to a method and system for searching audio files. 

2. History of Related Art

Audio information is frequently distributed on a storage medium (referred to herein as a
multimedia storage medium or audio storage medium) such as a compact disc (CD), digital video
disc (DVD), audio tape, or VCR tape. On such media, audio information is typically arranged in
a sequential fashion. Locating a particular portion of the audio information typically requires the
user to advance (or reverse) through the media under manual control in an attempt to locate the
precise location containing the desired information. Typically, however, the user's ability to
rapidly locate a desired portion of the audio content is significantly limited. In an application
where, for example, music is stored on a CD, the user is usually able only to advance to a
pre-determined number of locations within the CD, namely, the beginning of each song on the
CD. Within a particular song, the user may have the ability to advance the disc by a specified
amount, but the audio output is typically disabled while the disc is advanced, making it difficult to
locate quickly a precise point in the song. Similarly, many consumers have had the experience of
fast forwarding an audio tape or VCR to find a particular location in the tape. Typically, the user
must respond reactively to media content that is flashing across a television screen or coming
from a speaker at an unintelligible rate, resulting in a back-and-forth search process that is time
consuming, annoying, and potentially detrimental to the media player as its mechanisms are
rapidly switched from fast forward and reverse settings to a play setting.

SUMMARY OF THE INVENTION

The problems identified above are in large part addressed by a system for locating an
audio segment within a storage device. The system relieves a user from the tedious fast-forward,
reverse, and playback process typically employed to manually search for a desired location within
a medium. In addition, the automated searching process disclosed herein is capable of processing
information much faster than is possible using manual searching techniques. A 40X CD-ROM
device, for example, could search a CD for a given input sequence at a speed far greater than the
greatest speed detectable with the human ear or eye. The system includes an input device
suitable for transmitting an input sample that is indicative of the audio segment and a media
player suitable for playing audio information stored on the storage device. The system further
includes a sample converter configured to generate an input sample diphthong sequence in
response to receiving the input sample from the input device. The input sample diphthong
sequence may comprise a digital representation of the diphthong components of the input
sample. An audio converter of the system is configured to generate an audio content diphthong
sequence. The audio content diphthong sequence may comprise a digital representation of the
diphthong components of the audio information on the storage device. The system may further
include a comparator configured to detect a match between the input sample diphthong sequence
and a portion of the audio content diphthong sequence. In one embodiment, the input device may
be a keyboard and the input sample may be a text sample. In another embodiment, the input
device may be a microphone and the input sample may be an audio message. In one
embodiment, the comparator is further configured to produce a signal that indicates the location
within the storage device of the matching portion of the audio content diphthong sequence. A
media player may be configured to receive the location signal from the comparator and to
advance the storage device to the location indicated by the location signal. The storage device
may comprise a compact disc, a digital video disc, a VCR, an audio tape, or other storage device
suitable for storing the input sequence.

The invention further contemplates a method of operating a multimedia or audio storage
device player system in which an input sample is converted to a first sequence of diphthongs. An
audio segment within a storage device is then located, where the diphthong components of the
audio segment and the first sequence of diphthongs satisfy match criteria. The storage device
may then be advanced to the location of the matching audio segment. In one embodiment,
converting the input sample to a first sequence comprises converting a text sample to its
component diphthongs, while, in another embodiment, converting the input sample to the first
sequence includes converting an audio sample to its component diphthongs. Locating the audio
segment may include converting the audio content of the storage device to a second sequence of
diphthongs and comparing the first and second sequences of diphthongs for a match.

BRIEF DESCRIPTION OF THE DRAWINGS 

Other objects and advantages of the invention will become apparent upon reading the
following detailed description and upon reference to the accompanying drawings, in which:

FIG 1 is a block diagram of a system for locating a selected audio segment on a storage
medium according to one embodiment of the present invention;

FIG 2 is a block diagram of a sample comparator of the system of FIG 1 according to one
embodiment of the invention;

FIG 3 is a block diagram of a data processing system suitable for implementing the
sample comparator of FIG 2;

FIG 4 is a flow diagram of a method of searching for an audio segment according to one
embodiment of the invention; and

FIG 5 is a block diagram of a sample comparator according to one embodiment of the
invention.

While the invention is susceptible to various modifications and alternative forms, specific
embodiments thereof are shown by way of example in the drawings and will herein be described
in detail. It should be understood, however, that the drawings and detailed description presented
herein are not intended to limit the invention to the particular embodiment disclosed, but on the
contrary, the intention is to cover all modifications, equivalents, and alternatives falling within
the spirit and scope of the present invention as defined by the appended claims.

DETAILED DESCRIPTION OF THE DRAWINGS 

Turning now to the drawings, FIG 1 illustrates a system 100 for searching audio
information to find an instance of a specified portion of audio content. In the depicted
embodiment, system 100 includes an input device such as a microphone 102 or a keyboard 104
connected to a sample comparator 106. The input device is suitable for transmitting an audio or
text input sample to sample comparator 106. Although the depicted embodiment indicates both a
keyboard 104 and a microphone 102, system 100 may be implemented with just a single input
device. Sample comparator 106 communicates with a media player 108 that is suitable for
playing the content of an audio or multimedia storage device 109 (referred to herein simply as
storage device 109) such as a compact disc (CD), digital video disc (DVD), VCR, or audio tape.
Sample comparator 106 is preferably configured to deconstruct the text or audio input sample
into a sequence of component pieces. The sequence is then used as the basis to search the
content of a suitable storage device 109 for a matching sequence as defined by a specified set of
match criteria. Upon detecting a match, one embodiment of system 100 is configured to advance
storage device 109 to the matching entry in storage device 109. In this manner, system 100
enables a user to search automatically through a large audio file to find specified content and to
set the media player at the location of storage device 109 containing the specified content.

Turning now to FIG 2, additional detail of sample comparator 106 according to one
embodiment of the invention is presented. In the depicted embodiment, sample comparator 106
includes a sample converter 104 that is configured to receive specified audio content indicated by
reference numeral 103 and referred to for purposes of this disclosure as an input sample. Input
sample 103 may comprise audio content such as a portion of a spoken message or text content
generated with a keyboard. In either embodiment, sample converter 104 is suitable for
generating, from input sample 103, a sequence of monosyllabic speech sounds referred to herein
as diphthongs. Diphthongs are combined to form all of the words in a spoken language. The
number of diphthongs required to form the vast majority of words used in spoken languages,
such as English, is relatively small, thereby enabling the creation of a very large number of words
from a relatively small number of diphthongs. The sequence of diphthongs generated by sample
converter 104 represents the input sample 103. In an embodiment in which input sample 103
comprises audio information received via microphone 102, sample converter 104 utilizes any of
a variety of speech recognition techniques to transform a spoken input sample 103 into its
component diphthongs. Sample converter 104 may then assign a digital value to each of the
diphthongs that form the spoken input sample 103 to form a sequence of digital values that are
indicative of their corresponding diphthongs. The sequence of digital values generated by
sample converter 104 is identified in FIG 2 by reference numeral 105 and referred to herein
simply as diphthongs 105 or diphthong sequence 105. Thus, sample converter 104 of sample
comparator 106 is adapted to generate a diphthong sequence 105 that represents and is indicative
of the audio content of the input sample 103. In an embodiment in which input sample 103
comprises text information, sample converter 104 may generate diphthongs 105 based on an
exact approach, using a diphthong database, or on a heuristic approach. These approaches are
disclosed in a co-pending patent application of Baumgartner et al., entitled Generating
Multimedia Information from Text Information Using Customized Dictionaries, which shares an
assignee with the present invention and is incorporated by reference herein. As indicated in FIG
2, the diphthong sequence 105 generated by sample converter 104 is forwarded to a string
comparator 130.
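
By way of illustration only, the exact, database-driven conversion of a text sample might be
sketched in Python as follows. The lexicon, the sound-unit labels, and the integer codes are
hypothetical placeholders and are not taken from any actual diphthong database; the sketch shows
only how a text sample could be reduced to a sequence of digital values of the kind identified
as diphthong sequence 105.

```python
# Hypothetical lexicon mapping words to sound-unit labels (illustrative only).
LEXICON = {
    "all": ["AO", "L"],
    "work": ["W", "ER", "K"],
    "dull": ["D", "AH", "L"],
    "boy": ["B", "OY"],
}


def build_code_table(lexicon):
    """Assign a distinct integer code to every sound unit in the lexicon."""
    units = sorted({unit for units in lexicon.values() for unit in units})
    return {unit: code for code, unit in enumerate(units, start=1)}


def text_to_sequence(text, lexicon, code_table):
    """Convert a text sample to a list of integer codes, one per sound unit.

    Raises KeyError for a word missing from the lexicon; a heuristic
    approach would be needed to handle such words.
    """
    sequence = []
    for word in text.lower().split():
        word = word.strip('.,;:!?"')
        for unit in lexicon[word]:
            sequence.append(code_table[unit])
    return sequence


CODES = build_code_table(LEXICON)
sample_sequence = text_to_sequence("All work", LEXICON, CODES)
```

Because both the input sample and the audio data would be encoded with the same code table, the
resulting sequences can be compared symbol by symbol.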

Turning momentarily to FIG 5, an embodiment of sample comparator 106 is depicted in
which a digitized representation of input sample 103 is compared directly with the digitized
representation of audio data 120 without extracting diphthong information as is done in the
embodiment of sample comparator 106 depicted in FIG 2. Instead, a first digitizer 504 generates
a digitized representation of the audio (or audio-video) content of input sample 103. This
digitized representation (represented by reference numeral 505) is received by a comparator 530.
Similarly, audio data 120 is digitized by a digitizer 522 (which may or may not comprise the
same digitizer as digitizer 504) to generate a digitized representation of audio data 120 as
indicated by reference numeral 525, which is also received by comparator 530. Comparator 530
then compares digitized sample 505 with digitized data 525 to determine if a match exists
between the two digitized data files. This embodiment may be suitably employed where
the input sample 103 comprises a "real" sample, such as a Beethoven
concerto segment or other type of audio content that is not readily representable by a text or
speech segment.

In one embodiment, comparator 530 includes hardware and software suitable for
performing a fast Fourier transform (FFT) on digitized sample 505 and digitized data 525. In this
embodiment, comparator 530 further includes software suitable for performing a correlation
function to check for a match in the frequency domain between digitized sample 505 and
digitized data 525. In one embodiment, segments or "windows" of audio data 120 are
transformed to the frequency domain by the FFT capabilities of comparator 530 and then
compared with a frequency domain representation of digitized sample 505 (also generated by
comparator 530). Each of these windows represents a time slice of audio data 120. In one
embodiment, each window corresponds to a time slice of audio data 120 that is comparable in
length to the length of input sample 103, although the length of the window is preferably
alterable by the user.
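
The windowed frequency domain comparison might be sketched as follows. This is a minimal
illustration under stated assumptions, not the implementation of comparator 530: a naive
discrete Fourier transform stands in for the FFT so that the sketch needs no external library,
the "correlation function" is assumed to be a normalized correlation of magnitude spectra, and
the function names and synthetic test tones are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Return DFT magnitudes for bins 0..n//2 using a naive O(n^2) transform.

    A practical system would use an FFT, which yields the same values faster.
    """
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        spectrum.append(abs(total))
    return spectrum


def correlation(a, b):
    """Normalized correlation (cosine similarity) of two equal-length spectra."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_window(sample, data, hop):
    """Slide a window the length of `sample` over `data`.

    Returns (start_index, score) for the best-scoring window, or (0, -1.0)
    if `data` is shorter than `sample`.
    """
    target = magnitude_spectrum(sample)
    best = (0, -1.0)
    for start in range(0, len(data) - len(sample) + 1, hop):
        window = data[start:start + len(sample)]
        score = correlation(target, magnitude_spectrum(window))
        if score > best[1]:
            best = (start, score)
    return best


def tone(freq, length, rate=64):
    """Synthetic sine tone used only to exercise the sketch."""
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(length)]


audio = tone(4, 64) + tone(10, 64) + tone(20, 64)
query = tone(10, 32)
start, score = best_window(query, audio, hop=16)
```

Magnitude spectra discard phase, so every window lying wholly inside the matching tone scores
about equally; a finer time-domain comparison could refine the reported location.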

In one embodiment, overlapping windows are sampled to increase the probability of
capturing the portion of audio data 120 that matches input sample 103 within a single window.
For example, one embodiment might include time slice windows that have a length of T seconds,
where T is approximately equal to the length of input sample 103, and might sample audio data
120 every T/N seconds, where N is an integer greater than 0. If, as an example, input sample 103
is approximately 10 seconds long, the time slice window T might be 10 seconds as well. For
N=2, ten second time slices would be sampled every T/N = 5 seconds. Thus, each ten second
time slice would overlap its neighboring time slice by five seconds. Assuming that audio data
120 contains at least one match to input sample 103, this implementation would guarantee that at
least 75% of the matching segment of audio data 120 would lie within a single time slice. If
greater accuracy is required, N can be increased. One embodiment might include multiple
iterations where the first iteration uses a relatively low value for N to identify windows of audio
data 120 that might contain a match to input sample 103. These identified windows of audio
data 120 could then be sampled during a subsequent iteration using a higher value of N to achieve
greater accuracy.
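
The window arithmetic above can be checked with a short sketch. The function names are
hypothetical, and edge effects at the very end of audio data 120 are ignored. The start of a
matching segment lies at most half a sampling interval, T/(2N), from the nearest window start,
so the best window contains at least 1 - 1/(2N) of a match of length T, which gives 75% for N=2.

```python
from fractions import Fraction


def window_starts(total_length, window_length, n):
    """Start times of windows of length T sampled every T/N seconds."""
    hop = Fraction(window_length, n)
    starts = []
    start = Fraction(0)
    while start + window_length <= total_length:
        starts.append(start)
        start += hop
    return starts


def worst_case_coverage(n):
    """Smallest fraction of a length-T match contained in the best window."""
    return 1 - Fraction(1, 2 * n)


# Thirty seconds of audio, ten second windows, N = 2: a window every 5 seconds.
starts = window_starts(total_length=30, window_length=10, n=2)
```

Under the same reasoning, N=4 would raise the guaranteed coverage to 7/8 at the cost of twice as
many windows to transform and compare.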

Returning now to the embodiment depicted in FIG 2, sample comparator 106 further
includes an audio converter 122 that is adapted to parse audio information from the storage
device 109. (The audio content of storage device 109 is identified as audio data 120 in FIG 2.)
Audio converter 122 may include an audio decoder capable of processing, as examples, MPEG
or linear PCM encoded bit streams, wav files, etc. In addition, audio converter 122 may include
an analog-to-digital converter enabling converter 122 to accept analog audio data from an audio
tape or the audio track of a VCR. Audio converter 122 generates a sequence of diphthong
information indicated by reference numeral 125 that is representative of the content of audio data
120. Like the input sample diphthong sequence 105, the audio data diphthong sequence 125 may
comprise a set or sequence of digital values, each corresponding to a particular diphthong.

In the depicted embodiment, input sample diphthong sequence 105 is received by a
comparator 130. Comparator 130 is adapted to search the audio data diphthong sequence 125 for
a match with input sample diphthong sequence 105. By converting input sample 103 and audio
data 120 to a common format, namely, a diphthong format, comparator 130 may be implemented
as a conventional string comparator that utilizes standard pattern matching algorithms. When a
match is detected between input sample diphthong sequence 105 and a portion of audio data
diphthong sequence 125, the depicted embodiment of string comparator 130 generates a signal
132 that is received by media player 108. The signal 132 preferably indicates the location within
storage device 109 where the audio segment in audio data 120 that matches input sample 103 is
found. In one embodiment, media player 108 responds to signal 132 by forwarding the
multimedia storage device 109 to the location indicated by signal 132 such that media player
108 may immediately begin playing at the desired location.
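
A string comparison of this kind, together with the location reported by signal 132, might be
sketched as follows. The sketch assumes that each value in the audio data diphthong sequence is
stored alongside an offset into the storage device; the function name, the offsets list, and the
`start_index` parameter are illustrative assumptions rather than features recited above.

```python
def find_exact(sample_seq, audio_seq, offsets, start_index=0):
    """Return the storage offset of the first exact match, or None.

    sample_seq  -- codes for the input sample (sequence 105)
    audio_seq   -- codes for the audio data (sequence 125)
    offsets     -- offsets[i] is the position (for example, seconds) of audio_seq[i]
    start_index -- index at which to resume, for finding later occurrences
    """
    m = len(sample_seq)
    for i in range(start_index, len(audio_seq) - m + 1):
        if audio_seq[i:i + m] == sample_seq:
            return offsets[i]
    return None


audio_seq = [7, 3, 9, 5, 6, 2, 9, 5, 6, 1]
offsets = [0.0, 0.4, 0.9, 1.3, 1.8, 2.2, 2.7, 3.1, 3.6, 4.0]
location = find_exact([9, 5, 6], audio_seq, offsets)
```

Here `location` is the offset at which the first matching segment begins; resuming the search
from a later `start_index` would locate a subsequent occurrence.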

In one embodiment, string comparator 130 may utilize match criteria that find and report
the location of exact matches between input sample diphthong sequence 105 and audio data
diphthong sequence 125. In another embodiment, system 100 employs match criteria that permit
the use of "fuzzy pattern matching" to desensitize system 100 to variations in
speech-to-diphthong conversion technology and to allow the use of partial phrases. Fuzzy
pattern matching algorithms are used in a variety of contexts including, as an example,
"suggestion" generators for spelling checker applications. Additional information relative to
fuzzy pattern matching algorithms is available in J. C. Bezdek & S. K. Pal (Eds.), Fuzzy Models
for Pattern Recognition: Methods That Search for Structures in Data (IEEE; August 1992)
ISBN: 0780304225, which is incorporated by reference herein. In one embodiment utilizing
fuzzy pattern matching, the user is permitted to specify wildcards to further narrow down the
search results. Imagine, for example, a user is searching for an occurrence of the quote "all work
and no play makes Jack a dull boy." If the user recalls only that the phrase begins with "all
work" and ends with "dull boy," one embodiment of the invention permits the placement of
either a text or an audio wildcard between the phrase fragments "all work" and "dull boy" to
narrow the search beyond the scope of searching either phrase fragment on its own. The
wildcard may place additional restrictions on the search results such that, for example, all phrase
fragments must be located within a specified number of diphthongs of one another. In another
embodiment, not explicitly shown in the drawings, sample converter 104 and audio converter
122 may generate text files in lieu of diphthong sequences. In this embodiment, sample
converter 104 and audio converter 122 may employ speech-to-text software suitable for creating
the text files from audio input. Comparator 130 would then search the text file representing
audio data 120 for a match with the text file representing the input sample 103.
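
Two of the match criteria described above might be sketched as follows. Neither function is
taken from the cited reference: the fuzzy search substitutes a standard edit-distance dynamic
program (approximate substring matching in the style of Sellers' algorithm), and the wildcard
search simply requires two fragments to occur in order within a maximum gap. All names and the
word tokens standing in for diphthong codes are hypothetical.

```python
def fuzzy_find(pattern, text, max_errors):
    """Approximate substring search by edit distance.

    Returns (end_index, errors) for the match with the fewest errors, or
    None if no substring of text is within max_errors edits of pattern.
    """
    # prev[j]: fewest edits aligning pattern[:j] with text ending at the prior symbol
    prev = list(range(len(pattern) + 1))
    best = None
    for i, symbol in enumerate(text):
        cur = [0]  # a match may begin anywhere in text at no cost
        for j, p in enumerate(pattern, start=1):
            cur.append(min(
                prev[j - 1] + (p != symbol),  # exact match or substitution
                prev[j] + 1,                  # extra symbol in text
                cur[j - 1] + 1,               # pattern symbol missing from text
            ))
        if cur[-1] <= max_errors and (best is None or cur[-1] < best[1]):
            best = (i, cur[-1])
        prev = cur
    return best


def find_all(fragment, text):
    """Indices at which fragment occurs exactly in text."""
    m = len(fragment)
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == fragment]


def wildcard_find(first, second, text, max_gap):
    """Find `first` followed by `second` with at most max_gap symbols between.

    Returns a list of (start_of_first, end_of_second) index pairs.
    """
    hits = []
    for i in find_all(first, text):
        gap_start = i + len(first)
        for j in find_all(second, text):
            if 0 <= j - gap_start <= max_gap:
                hits.append((i, j + len(second) - 1))
    return hits


# Word tokens stand in for diphthong codes to keep the example readable.
quote = "all work and no play makes jack a dull boy".split()
spans = wildcard_find(["all", "work"], ["dull", "boy"], quote, max_gap=8)
near = fuzzy_find([9, 5, 6], [7, 3, 9, 4, 6, 2], max_errors=1)
```

Tightening `max_gap` or `max_errors` narrows the results, while loosening them tolerates more
variation in the speech-to-diphthong conversion.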

In one embodiment, a properly configured microprocessor-based computing device may
be used to implement system 100. Turning momentarily to FIG 3, selected components of such a
computing device are indicated by reference numeral 200. In the depicted embodiment,
computing device 200 includes one or more processors 201 connected to a system memory 202
via a system bus 204. Any of a variety of commercially distributed microprocessors may be used
as processors 201 including, as examples, PowerPC® processors from IBM Corporation, Sparc®
microprocessors from Sun Microsystems, and x86 compatible microprocessors such as
Pentium® processors from Intel Corporation and Athlon® processors from Advanced Micro
Devices. Computing device 200 may further include one or more bridges 208 for providing
communication between system bus 204 and a peripheral bus 206. The one or more peripheral
busses 206 may be compliant with industry standard peripheral busses including, as examples,
the Industry Standard Architecture (ISA), the Extended Industry Standard Architecture (EISA),
the Accelerated Graphics Port (AGP), and the Peripheral Component Interconnect (PCI) as specified
in the PCI Local Bus Specification Rev. 2.2 available from the PCI Special Interest Group at
www.pcisig.org and incorporated by reference herein. The depicted embodiment of computing
device 200 further includes suitable input devices such as keyboard 210 and pointing device 212
connected to peripheral bus 206 via an I/O adapter 214. Computing device 200 may further
include output devices including speaker 110 connected to peripheral bus 206 via audio adapter
216 and a display device 222 connected to peripheral bus 206 via a graphics adapter 218. In one
embodiment, computing device 200 may comprise a conventional desktop or laptop personal
computer that is connected to media player 108 through an appropriate connection. In another
embodiment, computing device 200 may comprise an embedded data processing system within media
player 108. Portions of system 100, such as sample converter 104, audio converter 122, and
string comparator 130, may be implemented as a set of instructions stored on a computer
readable medium such as system memory 202 of computing device 200, a hard disk, floppy disk,
CD ROM, magnetic tape, or other storage facility. In this implementation, the set of computer
instructions is suitable for execution by processor(s) 201 of computing device 200 or by another
suitable processor or controller.

Turning now to FIG 4, a flow diagram illustrating a method 140 of searching a storage
device for specified audio content is depicted. The method 140 enables a user to quickly and
automatically locate a desired point in a storage device containing audio content. The method
improves on the cumbersome and time consuming method by which a user is typically required
to advance through a multimedia storage device attempting to locate a specific passage or
location. In the embodiment depicted in FIG 4, an input sample is initially detected in step 142.
The input sample, as discussed previously, may be an audio segment that is spoken by the user or
a text segment that is typed or otherwise written by the user. Alternatively, the input may
comprise an audio or audio-video sample stored on a storage medium. As an example, the user
may have a small audio or audio-video segment on an analog tape as the input sample. In this
embodiment, the media player 108 depicted in FIG 1 may serve as the input device as well as the
device used to transmit audio data 120 to audio converter 122. In any event, the input sample
indicates (in either an exact manner or in a "fuzzy" manner) the audio content of the storage
device for which the user is searching. Upon detecting the input sample, an input sample
diphthong sequence (the input sequence) is constructed in step 144 with a sample converter that
is configured to receive the input sample in the form of a text file in the case of a typed input
sample, a digitized representation of an audio message in the case of a spoken input sample, or
both. In parallel with the construction of the input sequence, audio data from the multimedia
storage device is processed or encoded in step 146 to produce an audio content diphthong
sequence. The encoding of the audio data may occur either before, during, or after the
construction of the input sequence in step 144. The input sequence is then used (in conjunction
with specified match criteria) to search (step 148) the audio data diphthong sequence. If no
match between the input sample diphthong sequence and the audio data diphthong sequence is
detected in step 152, a message indicating that no match occurred is generated in step 150. If a
match is detected, the depicted embodiment of method 140 includes a step 154 in which the user
is prompted to indicate whether the matching entry is the entry that the user was searching for (in
case the multimedia storage device includes multiple occurrences of the stored information). If the
user indicates that the matching entry is the correct entry, the multimedia storage device is advanced
(step 156) to the matched entry. If the user indicates that the matching entry is not the correct
entry, the method returns to searching step 148 to find the next occurrence of the input sample.
In one embodiment, the production of the audio content diphthong sequence and the
searching of the sequence occur in a "handshaking" fashion. In this embodiment, as diphthong
sequences are generated in step 146 by the converter, they are forwarded to the comparator and
searched in step 148. If the comparator detects a match, it sends a command to the media player,
such as an audio tape player, to stop and to rewind by the appropriate amount to the beginning of
the matching segment. The rewinding can be handled by sending offset information to the
comparator with each diphthong. When the comparator detects a match, the offset information
can be re-sent to the media player to indicate the beginning location of the segment upon
determining that the segment matches the input sequence. This handshaking embodiment
beneficially requires less memory by eliminating the need to save the contents of the entire media
device until the search process is initiated. In addition, by detecting matching diphthong
sequences as they are generated, the media device will be at or near the physical location of the
matching sequence when it is detected, thereby eliminating the need to rewind or fast-forward the
media player by a significant amount.
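
The handshaking embodiment might be sketched as a streaming search in which each diphthong
arrives with its offset and only a short buffer is retained. The generator standing in for the
audio converter, the half-second spacing of the offsets, and the function names are assumptions
made for illustration; exact matching is used for brevity.

```python
from collections import deque


def stream_search(sample_seq, diphthong_stream):
    """Search a stream of (code, offset) pairs as they are produced.

    Only the most recent len(sample_seq) pairs are buffered, so the full
    audio content never has to be held in memory. Returns the offset at
    which the matching segment begins, or None if the stream ends first.
    """
    target = list(sample_seq)
    if not target:
        return None
    window = deque(maxlen=len(target))  # the oldest pair is dropped automatically
    for code, offset in diphthong_stream:
        window.append((code, offset))
        if len(window) == len(target) and [c for c, _ in window] == target:
            return window[0][1]  # offset of the first diphthong in the match
    return None


def converter(codes, seconds_per_unit=0.5):
    """Hypothetical stand-in for the audio converter: yields (code, offset)."""
    for index, code in enumerate(codes):
        yield code, index * seconds_per_unit


rewind_to = stream_search([9, 5, 6], converter([7, 3, 9, 5, 6, 2]))
```

The returned offset corresponds to the location that would be re-sent to the media player so
that it can rewind to the beginning of the matching segment.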

It will be apparent to those skilled in the art having the benefit of this disclosure that the
present invention contemplates a system and method for locating content within a multimedia or
audio storage device. It is understood that the form of the invention shown and described in the
detailed description and the drawings is to be taken merely as a presently preferred example. It
is intended that the following claims be interpreted broadly to embrace all the variations of the
preferred embodiments disclosed.

WHAT IS CLAIMED IS:

1. A system for locating an audio segment within a storage device, comprising: 

an input device suitable for transmitting an input sample indicative of the audio segment;

a media player suitable for playing audio content stored on the storage device;

a sample converter configured to generate an input sample diphthong sequence in
response to receiving the input sample from the input device, wherein the input sample
diphthong sequence comprises a digital representation of the diphthong components of
the input sample;

an audio converter configured to generate an audio content diphthong sequence
comprising a digital representation of the diphthong components of the audio content of
the storage device; and

a comparator configured to detect a match between the input sample diphthong sequence
and a portion of the audio content diphthong sequence.

2. The system of claim 1, wherein the input device comprises a keyboard and the input sample 
comprises text. 

3. The system of claim 1, wherein the input device comprises a microphone and the input sample
comprises an audio message.

4. The system of claim 1, wherein the input device comprises the media player and the input
sample comprises information recorded on a storage medium.

5. The system of claim 1, wherein the comparator is further configured to produce a signal
indicative of the location within the storage device of the matching portion of the audio content
diphthong sequence.

6. The system of claim 5, further comprising a media player configured to receive the location 
signal from the comparator and to advance the storage device to the location indicated by the 
location signal. 

7. The system of claim 1, wherein the storage medium comprises a compact disc.

8. The system of claim 1, wherein the storage medium comprises a digital video disc. 

9. A method of operating a multimedia storage device player system, comprising: 

converting an audio input sample to a digitized representation of the input sample; and

locating a matching audio segment within audio data stored on a storage device, wherein
a digitized representation of the audio segment and the digitized representation of the
input sample satisfy match criteria.

10. The method of claim 9, further comprising advancing the storage device to the location of
the matching audio segment.

11. The method of claim 9, further comprising transforming the input sample to a frequency
domain representation of the input sample and transforming a portion of the audio data to a
frequency domain representation of the portion, wherein locating a matching segment includes
correlating the input sample frequency domain representation to the audio data frequency domain
representation.

12. The method of claim 11, wherein transforming the input sample and the audio data segment
comprises a Fourier transform.

13. The method of claim 9, wherein converting the input sample to its digitized representation
comprises converting the sample to a first sequence of diphthongs and further wherein locating
the audio segment includes converting the audio content of the storage device to a second
sequence of diphthongs and comparing the first and second sequences of diphthongs for a match.

14. The method of claim 13, wherein converting the audio input sample comprises converting 
the audio input sample to a first text file, and further wherein locating the matching audio 
segment comprises converting the audio content on the storage device to a second text file. 

15. A computer program product for locating an audio segment in a storage device, the computer
program product comprising a computer readable medium configured with processor executable
instructions, comprising:

first converter means for generating a first diphthong sequence responsive to receiving an
input sample, wherein the first diphthong sequence is indicative of the input sample;

second converter means for generating a second diphthong sequence from audio
information stored on the storage device; and

comparator means for locating a portion of the second diphthong sequence, wherein the
located portion of the second diphthong sequence and the first diphthong sequence match
according to a specified set of match criteria.

16. The computer program product of claim 15, wherein the input sample comprises a text
sample.

17. The computer program product of claim 15, wherein the input sample comprises an audio
sample.
18. The computer program product of claim 15, wherein the comparator means includes means
for indicating the location within the storage device of the audio information corresponding to
the second diphthong sequence.

19. The computer program product of claim 15, wherein the match criteria require an exact match
between the first and second diphthong sequences.

20. The computer program product of claim 15, wherein the match criteria are fuzzy criteria.

21. The computer program product of claim 15, wherein the computer readable medium
comprises one of a floppy diskette, hard disk, CD ROM, or magnetic tape.

ABSTRACT 

A system, method, and computer program product for locating an audio segment within a
storage device are disclosed. The system includes an input device suitable for transmitting an
input sample that is indicative of the audio segment and a media player suitable for playing audio
information stored on the storage device. The system further includes a sample converter
configured to generate a digitized representation of the input sample and a digitized
representation of the audio information on the storage device. The digitized representation of the
input sample may comprise a diphthong sequence indicative of the diphthong components of the
input sample. In this embodiment, an audio converter of the system is configured to generate an
audio content diphthong sequence. The audio content diphthong sequence may comprise a
digital representation of the diphthong components of the audio information on the storage
device. The system may further include a comparator configured to detect a match between the
input sample diphthong sequence and a portion of the audio content diphthong sequence. In one
embodiment, the input device may be a keyboard and the input sample may be a text sample. In
another embodiment, the input device may be a microphone and the input sample may be an
audio message. In one embodiment, the comparator is further configured to produce a signal that
indicates the location within the storage device of the matching portion of the audio content
diphthong sequence. A media player may be configured to receive the location signal from the
comparator and to advance the storage device to the location indicated by the location signal.
The storage device may comprise a compact disc, a digital video disc, a VCR, an audio tape, or
other storage device suitable for storing the input sequence.
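
As a non-limiting sketch, the location signal described above might be modeled as a time offset carried alongside each unit of the audio content diphthong sequence. The sketch assumes that each unit records the position, in seconds, at which it occurs on the storage device and that the media player exposes a seek operation; the `Unit`, `location_signal`, and `MediaPlayer` names are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Unit:
    """One unit of the audio content sequence plus where it occurs (assumed)."""
    symbol: str
    offset_s: float  # position on the storage device, in seconds


def location_signal(query: Sequence[str],
                    content: Sequence[Unit]) -> Optional[float]:
    """Return the offset of the first exact match of `query`, or None."""
    n = len(query)
    if n == 0:
        return None
    symbols = [u.symbol for u in content]
    for start in range(len(symbols) - n + 1):
        if symbols[start:start + n] == list(query):
            return content[start].offset_s
    return None


class MediaPlayer:
    """Stand-in for a player that can be advanced to a given location."""

    def __init__(self) -> None:
        self.position_s = 0.0

    def seek(self, offset_s: float) -> None:
        self.position_s = offset_s


# Hypothetical audio content sequence with assumed time offsets.
content = [Unit("DH", 0.0), Unit("AH", 0.1), Unit("K", 12.5), Unit("W", 12.6)]
player = MediaPlayer()

signal = location_signal(["K", "W"], content)
if signal is not None:
    player.seek(signal)  # advance to the location indicated by the signal
```

Carrying the offset with each unit lets the comparator return a position that the player can use directly, without a second lookup.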

DECLARATION AND POWER OF ATTORNEY FOR
PATENT APPLICATION 

As a below named inventor, I hereby declare that: 

My residence, post office and citizenship are as stated below next to my name; 

I believe I am the original, first and sole inventor (if only one name is listed below) or an 
original, first and joint inventor (if plural names are listed below) of the subject matter which is claimed 
and for which a patent is sought on the invention entitled Method and System of Audio File
Searching,

the specification of which: 

X   is attached hereto.

    was filed on __________ as Application Serial No. __________

    and was amended on __________ (if applicable).

I hereby state that I have reviewed and understand the contents of the above identified specification, 
including the claims, as amended by any amendment referred to above. 

I acknowledge the duty to disclose information which is material to the patentability of this application
in accordance with Title 37, Code of Federal Regulations, § 1.56.

I hereby claim foreign priority benefits under Title 35, United States Code, § 119 of any foreign
application(s) for patent or inventor's certificate listed below and have also identified below any foreign 
application for patent or inventor's certificate having a filing date before that of the application on 
which priority is claimed: 

PRIOR FOREIGN APPLICATION(S)                        Priority Claimed

N/A                                                 Yes/No
(Number)        (Country)        (Date Filed)

N/A                                                 Yes/No
(Number)        (Country)        (Date Filed)



I hereby claim the benefit under Title 35, United States Code, § 120 of any United States application(s)
listed below and, insofar as the subject matter of each of the claims of this application is not disclosed
in the prior United States application in the manner provided by the first paragraph of Title 35, United
States Code, § 112, I acknowledge the duty to disclose information which is material to the patentability
of this application as defined in Title 37, Code of Federal Regulations, § 1.56, which occurred between
the filing date of the prior application and the national or PCT international filing date of this 
application: 

N/A 

(Application Serial No.) (Filing Date) (Status) 

N/A 

(Application Serial No.) (Filing Date) (Status)

I hereby declare that all statements made herein of my own knowledge are true and that all statements
made on information and belief are believed to be true; and further that these statements were made 
with the knowledge that willful false statements and the like so made are punishable by fine or 
imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that such willful 
false statements may jeopardize the validity of the application or any patent issued thereon. 

POWER OF ATTORNEY: As a named inventor I hereby appoint the following attorneys and/or 
agents to prosecute this application and transact all business in the Patent and Trademark Office 
connected therewith. 



John W. Henderson, Jr., Reg. No. 26,907; James H. Barksdale, Jr., Reg. No. 24,091; Thomas E. 
Tyson, Reg. No. 28,543; Robert M. Carwell, Reg. No. 28,499; Jeffrey S. LaBaw, Reg. No. 31,633;
Douglas H. Lefeve, Reg. No. 26,193; Casimer K. Salys, Reg. No. 28,900; David A. Mims, Jr., Reg. 
No. 32,708; Mark E. McBurney, Reg. No. 33,114; Anthony V. England, Reg. No. 35,129; Volel 
Emile, Reg. No. 39,969; Leslie A. Van Leeuwen, Reg. No. 42,196; Christopher A. Hughes, Reg. No. 
26,914; Edward A. Pennington, Reg. No. 32,588; John E. Hoel, Reg. No. 26,279; Joseph C. 
Redmond, Jr., Reg. No. 18,753; Marilyn S. Dawkins, Reg. No. 31,140; Joseph P. Lally, Reg. No. 
38,947; and Raman N. Dewan, Reg. No. 38,787. 



Send correspondence to:

Joseph P. Lally
Dewan & Lally, L.L.P.
P.O. Box 684749
Austin, Texas 78768-4749

and direct all telephone calls to:

(512) 428-9870



FULL NAME OF SOLE OR FIRST INVENTOR: Jason Raymond Baumgartner 

INVENTOR'S SIGNATURE: [signature] DATE: [illegible]

RESIDENCE: 14936 Purslane Meadow Trail, Austin, TX 78728 

CITIZENSHIP: U.S. 

POST OFFICE ADDRESS: Same as above 



FULL NAME OF SECOND INVENTOR: Nadeem Malik 



INVENTOR'S SIGNATURE: [signature] DATE: [illegible]

RESIDENCE: 8217 Crabtree Drive, Austin, TX 78750

CITIZENSHIP: Pakistan

POST OFFICE ADDRESS: Same as above



FULL NAME OF THIRD INVENTOR: Steven Leonard Roberts 

INVENTOR'S SIGNATURE: [signature] DATE: [illegible]

RESIDENCE: 6103 Diamond Head Drive, Austin, TX 78746 

CITIZENSHIP: U.S. 

POST OFFICE ADDRESS: Same as above 